In the previous version of this notebook I attempted to run the CPAN module for InterologWalk, but there were problems installing it and getting it to run locally. It turned out that InterologWalk had already been run over a large set of proteins at Edinburgh and that the output file was available, which makes this task much easier.

We first locate the file and look at its first few lines:


In [1]:
cd ../../InterologWalk/


/home/gavin/Documents/MRes/InterologWalk

In [2]:
ls


IW_entrez.csv

In [3]:
!head IW_entrez.csv












In [4]:
import csv

Creating dictionary and feature object

As was done with the STRING notebook, we will create an ocbio.ppipred.features object to store the dictionary of interactions. This can then be pickled and loaded when assembling feature vectors.


In [6]:
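# Map each unordered set of IDs (one row of the file) to the feature value '1'
featuredict = {}
with open("IW_entrez.csv") as f:
    for line in csv.reader(f, delimiter="\t"):
        # frozenset makes the key independent of the order of the IDs
        featuredict[frozenset(line)] = ['1']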
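
The reason for using frozenset keys can be seen in a small sketch. It assumes each row holds two Entrez IDs separated by a tab; the rows below are made up for illustration and are not taken from IW_entrez.csv.

```python
import csv

# Hypothetical tab-separated rows standing in for lines of IW_entrez.csv
rows = ["1234\t5678", "5678\t1234", "42\t99"]

pairs = {}
for line in csv.reader(rows, delimiter="\t"):
    # A frozenset is hashable and ignores order, so (A, B) and (B, A)
    # collapse onto the same dictionary key
    pairs[frozenset(line)] = ['1']

n_pairs = len(pairs)
forward = pairs[frozenset(["1234", "5678"])]
reverse = pairs[frozenset(["5678", "1234"])]
```

Here the first two rows describe the same interaction in opposite order, so only two distinct keys are stored and a lookup succeeds whichever way round the IDs are given.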

In [8]:
import sys

In [9]:
sys.path.append("../opencast-bio/")

In [10]:
import ocbio.ppipred

In [11]:
features = ocbio.ppipred.features(featuredict,1)

Testing

Testing with one key taken from the dictionary and one made-up key that should not be present:


In [12]:
realkey = next(iter(featuredict))  # first key; also works where keys() is not indexable
fakekey = frozenset(["1234","4321"])

In [13]:
features[realkey]


Out[13]:
['1']

In [14]:
features[fakekey]


Out[14]:
['0']
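
The behaviour seen in these two lookups, where a stored pair returns ['1'] and an unknown pair returns ['0'], could be reproduced with a dictionary lookup that falls back to a default. The class below is a hypothetical stand-in written for illustration, not the ocbio.ppipred.features implementation, and it assumes the second constructor argument is the length of the feature vector.

```python
class PairFeatures(object):
    """Hypothetical stand-in, NOT the ocbio.ppipred.features class."""

    def __init__(self, featuredict, length):
        self.featuredict = featuredict
        # Assumption: `length` is the number of values in each feature vector
        self.default = ['0'] * length

    def __getitem__(self, key):
        # Return the stored vector, or a copy of the default if the pair is absent
        return self.featuredict.get(key, list(self.default))


demo = PairFeatures({frozenset(["1234", "5678"]): ['1']}, 1)
present = demo[frozenset(["5678", "1234"])]
absent = demo[frozenset(["1234", "4321"])]
```

Returning a default rather than raising KeyError means the assembler can ask for any protein pair and always get a vector of the same length back.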

Pickling

Finally, we pickle this instance so that it can be accessed by the assembler to create feature vectors:


In [15]:
import pickle

In [16]:
with open("human.interologwalk.features.pickle", "wb") as f:
    pickle.dump(features, f)
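
As a quick check that a pickle can be read back, the file can be reopened in binary mode and loaded. The sketch below round-trips a plain dictionary through a temporary file so that it is self-contained; the file name and contents are made up. Loading the real file would additionally need ocbio.ppipred to be importable, since pickle stores a reference to the class rather than its code.

```python
import os
import pickle
import tempfile

# Hypothetical example data standing in for the features object
example = {frozenset(["1234", "5678"]): ['1']}

# Write to a temporary location rather than the notebook's working directory
path = os.path.join(tempfile.mkdtemp(), "example.features.pickle")
with open(path, "wb") as f:
    pickle.dump(example, f)

# Read it back in binary mode and compare with the original
with open(path, "rb") as f:
    loaded = pickle.load(f)

round_trip_ok = loaded == example
```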